Abstract

Recent evidence using compulsory schooling laws as instruments for education sug- 
gests that education has a causal effect on mortality (Lleras-Muney, 2005). However, 
little is known about how exactly education affects health. This paper uses compul- 
sory schooling laws to try to identify how education impacts health and to indirectly 
assess the merit of using these laws to infer the causal effect of education on health. 

I find that previous Census mortality results are not robust to the inclusion of state- 
specific time trends but that robust effects of education on general health status can 
be identified using individual level data in the SIPP. However, the pattern of effects 
for specific health conditions in the SIPP appears to depart markedly from prominent 
theories of how education should affect health. I also find that vaccination against 
smallpox for school age children may account for some of the improvement in health 
and its association with education. These results raise concerns about using early
twentieth-century compulsory schooling laws to identify the causal effects of education on health.

*I thank Douglas Almond for many helpful discussions and for contributing to some of the ideas in this 
paper. I also thank participants at the 2006 NBER Spring Health meetings for many helpful comments. I 
also thank Adriana Lleras-Muney for sharing her computer code and data and for helpful discussions. Similarly,
I thank Claudia Goldin for sharing her data. The views expressed here do not necessarily reflect those of
the Federal Reserve System.
1 Introduction



Social scientists have long been aware that there is a strong association between education
levels and health (Kitagawa and Hauser 1973), but much less is known about how these
factors are connected and whether the relationship is, in fact, causal. As Richard Suzman
of the National Institute on Aging recently stated, “Education ... is a particularly powerful
factor in both life expectancy and health expectancy, though truthfully, we’re not quite
sure why” (Lyman 2006). A recent study by Lleras-Muney (2005) provides perhaps the
strongest evidence that education has a causal effect on health. Utilizing a more compelling
research design than most previous work, Lleras-Muney uses state compulsory school and
child labor laws as instruments and finds that increased schooling for those born in the first
quarter of the twentieth century led to dramatic reductions in mortality rates during the
1960s and 1970s. Her IV point estimates imply that an additional year of schooling reduces
mortality risk by between 30 and 60 percent.^1 These results were prominently featured in a
front page story in the New York Times entitled “A Surprising Secret to a Long Life: Stay
in School” (Kolata 2007).

The mechanism by which schooling improves health, however, remains elusive. A variety
of theories have been proposed to explain how schooling might improve health. These
theories emphasize the role of education in affecting various proximate determinants of health.
Health determinants leveraged by education include: (i) financial resources, (ii) decision
making ability, and (iii) time preference. Lleras-Muney (2005) found that adjustment for income
or occupation did not alter her IV estimates, and therefore discounted the role of resources
behind her findings. Instead, her results appeared more consistent with a role of “critical
thinking skills.” Such skills may allow one to utilize advances in medical technology (e.g.,
Glied and Lleras-Muney (2003)) or manage chronic conditions better (e.g., Goldman and
Smith (2002)). These hypothesized mechanisms both reference the ability to obtain and
process information, which may be improved through education.^2

^1 Lleras-Muney uses the Census to estimate ten year mortality rates for synthetic cohorts using state of
birth, year of birth and sex. The mean death rate in Lleras-Muney’s sample is about 10 percent and her IV
estimates are as high as -0.06, implying that a 1 year increase in schooling would lower the death rate by as
much as 6 percentage points.

In this paper I examine whether compulsory schooling laws can provide insights into
exactly how education may improve health. As part of the analysis I also reassess whether
compulsory schooling laws can be used to draw inferences about the causal effects of education
on health. As is well known, there is no test of instrument exogeneity in the exactly
identified case.^3 I therefore conduct some (necessarily indirect) exercises to explore the
validity of the compulsory schooling law instruments.

I begin by revisiting the mortality results in Lleras-Muney (2005) by adding significantly
more data (e.g. doubling the 1970 Census sample and quintupling the 1980 sample) and
employing several robustness checks. The key finding is that the Census mortality results
are not robust to the inclusion of state-specific time trends. This raises the concern that
the instruments might be picking up smooth cohort trends in educational attainment rather
than discrete increases induced by more stringent compulsory schooling laws.^4 I also find
that the effects of education on mortality appear to be driven primarily by the earliest
cohorts (born 1901-1912) during the 1960 to 1970 period.

In a second exercise I use a new microdataset: the Survey of Income and Program
Participation (SIPP). The SIPP directly queries the health of each respondent, while the
Census must be aggregated by synthetic cohort (by state of birth, sex and year of birth) in
order to impute mortality by using estimates of “missing” individuals. The latter estimates
could arguably be due, in part, to a selection effect if less educated individuals are also less
likely to be captured by the Census over time. In contrast to the synthetic cohort approach,
with the SIPP we can be sure that those who were “treated” by the compulsory school
laws are indeed the same individuals registering the change in health. Moreover, as will be
shown later, the SIPP microdata provide relatively strong statistical power in assessing the
relationship between education and health.

^2 In addition, Lleras-Muney could not rule out the possibility that education lowered individual discount rates
and thereby led to healthier behaviors. See Grossman (2005).

^3 And the IV estimator is inconsistent when the instrument is endogenous.

^4 I also discovered a coding error that generated some erroneous OLS and IV estimates in Lleras-Muney
(2005) that are quantitatively meaningful. I discuss this in more detail in section 4.

Using the SIPP with the same IV strategy, I find large and statistically significant effects
of education on health that are robust to the inclusion of state-specific time trends. In 
particular, I document that the summary health measure from the SIPP - self-reported 
health - also improves with changes in schooling (as induced by compulsory school laws). 
So while there may be some uncertainty about the robustness of the Census mortality results 
based on group level data, the evidence is quite strong with the individual level data using 
health status as an outcome. This addresses the concern about whether the health effects 
were due to idiosyncratic changes in the laws. 

I then use the SIPP to examine a broad range of health measures in order to isolate which 
specific health conditions responded to education improvements induced by compulsory 
schooling laws. This is potentially useful for understanding whether or not the use of
compulsory schooling laws as instruments produces sensible results that accord with the
leading hypothesized mechanisms for how education affects health. If, for example, all of
the health effects are concentrated in only one or two health conditions that are unrelated
to improvements in medical technology or decision making ability, it might cast some doubt
on the validity of the instruments. If, on the other hand, we were to assume that the
instruments were valid, the results ought to be informative about the critical question of 
how exactly higher education levels lead to better health. 

In fact, I find that among the nineteen health conditions examined, only four show 
significant declines in incidence due to education. What is striking is the absence of effects 
among the many health conditions where decision-making ability is believed pivotal. For 
example, no effect is found for chronic diseases such as arthritis, cancer, heart disease, lung 
disease, or stroke incidence. The sole exception is diabetes, where the ability to maintain 
a treatment regime is especially important. Moreover, education is found to increase the 
likelihood of hypertension and kidney problems: conditions for which self-management and 
recent technological advance appear to be important. The lack of any effects across most
outcomes also suggests that the channel underlying the connection between education and health
is probably not financial resources or unobserved time preferences, which would tend
to improve health across the board.

What then accounts for the positive relationship between schooling and self-reported 
health using these instruments? Surprisingly, health conditions where decision-making
appears comparatively unimportant underlie the relationship. Sensory functions - in particular
hearing and vision - exhibit large and significant impacts when using compulsory school laws
as instruments. I also find that education reduces stiffness or deformity of the limbs, back 
problems, senility, and improves the ability to speak in the IV specifications. This pattern 
of effects suggests that either: (i) the mechanisms by which schooling impacts health depart 
markedly from those hypothesized, or (ii) the use of compulsory school laws as an instrument 
may be suspect. 

An important caveat is that with the SIPP I am using a sample of individuals
who have survived into their later years (between the ages of 59 and 83) where presumably 
there has already been considerable positive selection on education and health. Of course 
among this more selected sample we might expect there to be a bias against detecting any 
health effects, so the finding of a strong effect on overall health status might be considered 
surprising. Nonetheless the age of the sample raises questions about the extent to which 
these results generalize to the broader population. 

Finally, I hypothesize that schooling law changes may be correlated with other contemporaneous
policies either inside or outside of schools that improved long-term health.
During the early period of the twentieth century there were fairly dramatic improvements
in public health measures and large declines in concurrent mortality.^5 There was also a
recognition that compulsory schooling was useless if students were mentally or physically
unfit to attend school. This led to other reforms in the schools that were designed to
improve children’s wellbeing. In a third exercise I examine one potential factor that might
account for the observed relationship between schooling laws and improved health: smallpox
vaccination. The vaccination of children against smallpox as a requirement for school
entry is likely to be correlated with years of education and plausibly exerts effects on adult
health outcomes. Data on smallpox incidence and vaccination rates are thin, preventing
definitive conclusions. Nevertheless, I find that states that appeared to have more stringent
vaccination requirements for school entry experienced most of the gains in long-term health
generated by compulsory schooling laws. The fact that survivors of smallpox are known to
suffer from compromised vision, hearing and speaking provides some additional suggestive
evidence of a possible link between vaccination requirements and the estimated long-term
health effects of compulsory schooling laws.

^5 Cutler and Miller (2004), for example, argue that the introduction of clean water technologies early in
the century can account for half of the reduction in mortality in large cities during that time.

The remainder of the paper is organized as follows. In section 2 I review the relevant
literature, in section 3 the Census and SIPP data are described and the econometric models are
shown, in section 4 the baseline results are presented, in section 5 I consider the possible
role of smallpox and vaccination in schools, and in section 6 I conclude.

2 Literature Review 

It has been over thirty years since Grossman published his seminal economic model of
health determination (Grossman 1972). This model includes the assumption that education
increases the efficiency of health production. And while Grossman’s conceptual framework
has served as the “workhorse” model for applied work in health economics, little is understood
about how or what kinds of education enable the production of health.^6 For example,
in 2003, the National Institutes of Health solicited (quite general) research proposals on the
“Pathways Linking Education to Health.”

This RFA sought “validation of specific measures of abilities crucial to educational attainment,
such as level of cognitive or language skills” that improved health, and even cautioned
that: “The association or pathway between formal education and either important health
behaviors or diseases may not be causal. Instead it may reflect the influence of confounding
or co-existing determinants or may be bi-directional.” (NIH 2003)

^6 Grossman (2005) noted that “extensive reviews of the literature [concluded that] years of formal
schooling completed is the most important correlate of good health.”

Recent years have witnessed an upsurge of interest in education’s role in determining 
health. In one widely-cited paper, Goldman and Smith (2002) noted that more educated 
patients may manage chronic conditions better. Those with more schooling adhere more 
closely to treatment regimens for HIV infection and diabetes, which can be fairly complex. 
For such conditions, the ability to form independent judgements and comprehend treatments 
is important, and apparently is fostered by schooling. Accordingly, “self-maintenance is an
important reason for the very steep SES gradient in health outcomes” (Goldman and Smith
2002, p. 10934).

Glied and Lleras-Muney (2003) looked at health conditions experiencing more rapid 
technological change, finding that more educated respondents fared better. They argued
that “the most educated make the best initial use of new information about different aspects
of health” permitting them to respond more adeptly to evolving medical technologies. They
noted that no consensus measure existed for assessing the pace of innovation in health.
They therefore considered several measures, including the change in mortality rates for specific
conditions from 1986 to 1995 and the number of patents issued for particular conditions. 
They found that education gradients were steeper for diseases that were more innovative by 
these measures. 

A growing literature has also tried to examine whether the education gradient in health
is causal by using instrumental variables. Reviews of these studies may be found in Lleras-Muney
(2005) and Grossman (2005). While these studies typically find an effect of more
education leading to better health, in most cases it is questionable whether the instruments
are truly exogenous.^7 In contrast, the use of changes in compulsory schooling laws appears
to be a more compelling instrument choice since it is more plausibly exogenous than instruments
used in prior work. Previous studies also typically have looked at just one or two
health outcomes and have not systematically compared the effects across a range of health
outcomes to distinguish between competing theories of how education affects health.

^7 For example, Leigh and Dhir (1997) use parent schooling, parent income and state of residence as
instruments, all of which could plausibly affect long-term health independently of their effects through
schooling.

3 Data and Methodology

3.1 Mortality Data and Econometric Specification

I begin by describing the procedure used to estimate the effects of education on mortality
in Lleras-Muney (2005). This will provide the basic framework for extending the analysis
to examining other health outcomes in the SIPP and for expanding the analysis along
other dimensions. I briefly describe the approach here; for a more detailed discussion that
includes alternative estimation strategies see Lleras-Muney (2005).^8 The key idea is that in
the absence of a large sample tracking individuals over their entire lifetime, synthetic cohorts
are constructed using Census data. With the Census, we know age, completed education
and state of birth, which allows us to infer the compulsory schooling laws that affect each
cohort in each state of birth. Mortality can be measured by tracking population counts of
particular groups across Census years. The mortality rate at time t for cohort c of gender
g born in state s ($M_{cgst}$) is simply measured as the percentage decline in the population
count ($N_{cgst}$) within these cells over the subsequent ten years:

$$M_{cgst} = \frac{N_{cgst} - N_{cgs,t+1}}{N_{cgst}} \qquad (1)$$

^8 Lleras-Muney also uses several other approaches. She estimates the model at the individual level using
data from the National Health and Nutrition Examination Survey (NHANES). This is largely for comparative
purposes since the sample is too small to estimate statistically significant effects using IV. She also considers
Wald estimators and introduces a “mixed” two stage least squares approach using individual data in the
first stage but aggregate data in the second stage. The latter two approaches produce roughly similar
estimates to the aggregate IV estimates which are modeled here.
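As a rough illustration of equation (1), the following Python sketch computes ten year mortality rates for two cells. The cell keys and population counts are hypothetical values chosen for the example, not figures from the Census samples used in this paper.

```python
def ten_year_mortality(n_start, n_end):
    """Equation (1): share of a cell's population lost between two Censuses."""
    if n_start <= 0:
        raise ValueError("cell must be non-empty in the base year")
    return (n_start - n_end) / n_start


# Hypothetical cells keyed by (year of birth, gender, state of birth).
# Values are population estimates in the base year and ten years later.
cells = {
    (1910, "F", "OH"): {1960: 120_000, 1970: 108_000},
    (1910, "M", "OH"): {1960: 115_000, 1970: 97_750},
}

rates = {
    key: ten_year_mortality(counts[1960], counts[1970])
    for key, counts in cells.items()
}
```

Because the rate is inferred from population counts rather than observed deaths, any under-coverage of a cell in the later Census would show up as higher measured mortality in this calculation.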




The mortality rate for each cell is then modeled as follows:

$$M_{cgst} = a + E_{cgst}\pi + W_{cs}\delta + \gamma_c + \alpha_s + \theta_{cr} + \tau_t + d + \varepsilon_{cgst} \qquad (2)$$

where $E_{cgst}$ is the average education level for that cell at time t and $W_{cs}$ is a set of
cohort and state specific controls measured at age 14, intended to capture differences in other
potential early life determinants of mortality (e.g. manufacturing share of employment,
doctors per capita). The model also includes a set of cohort dummies $\gamma_c$, state of birth
dummies $\alpha_s$, interactions between cohort and region of birth $\theta_{cr}$, a female dummy $d$ and
year dummies $\tau_t$.

I construct two datasets for the analysis. The first attempts to replicate Lleras-Muney’s
point estimates and uses the same 1 percent IPUMS samples drawn from the 1960, 1970
and 1980 Censuses that are produced by the Minnesota Population Center.^9 The second
estimation sample replaces the 1970 1 percent sample with a 2 percent sample that combines
the form 1 and form 2 “state” 1 percent samples.^10 For 1980 I use a 5 percent sample. In
addition, I also add the 5 percent samples for 1990 and 2000. All of the Censuses are scaled
appropriately to produce population estimates that correspond to $N_{cgst}$.^11

Following Lleras-Muney, I restrict the analysis to cohorts born between 1901 and 1925.
I also follow her sample restrictions to exclude immigrants and blacks, and to topcode years of
education at 18 starting in 1980. For the expanded samples I also exclude cases where age,
state of birth and education are imputed by the Census Bureau. The descriptive statistics
for both samples are shown in Table 1. It is worth noting that the death rate for the 1970 to
1980 period is quite a bit larger with the expanded sample but that the standard deviation
is about 20 percent lower. There are now also 5 additional cells that had missing data when
using just the 1 percent samples. The death rates for the 1980-90 and 1990-2000 periods
are much higher due to the fact that I follow these same cohorts when they are much older.
Figure 1 plots the death rates by age for each Census year. This highlights the importance
of controlling for age in the specifications, which is done by adding polynomials in age to the
models.

^9 Ruggles et al. [2004].

^10 Unfortunately, combining any of the other four 1 percent samples that are available for 1970 would lead
to geographically unrepresentative samples.

^11 The 1960, 1970 and 1980 samples are self-weighting samples so the raw population counts can simply
be scaled up by multiplying by 100, 50, and 20. The 1990 and 2000 samples require weights to produce
representative estimates of the population. We found that using the self-weighting 1 percent samples for
1990 and 2000 instead of the 5 percent samples had little effect on the point estimates but increased the
standard errors.
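The scaling of the self-weighting samples described above can be sketched in a few lines of Python. The dictionary and function names are illustrative, and the sketch covers only the 1960, 1970 and 1980 samples, since the 1990 and 2000 samples require person weights rather than a constant factor.

```python
# Sampling fractions (in percent) of the self-weighting expanded samples.
SAMPLE_PERCENT = {1960: 1, 1970: 2, 1980: 5}


def population_estimate(raw_count, census_year):
    """Scale a raw cell count up to a population estimate (factors 100, 50, 20)."""
    return raw_count * (100 // SAMPLE_PERCENT[census_year])


# Hypothetical raw cell counts for one cohort-gender-state cell.
raw_counts = {1960: 1_150, 1970: 2_060, 1980: 4_300}
estimates = {year: population_estimate(n, year) for year, n in raw_counts.items()}
```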

One straightforward way to estimate $\pi$ in (2) would be through weighted least squares
(WLS), with the weights corresponding to the population represented by each cell. However,
this would produce a biased estimate due to omitted variables. Any number of factors
could plausibly be associated with both higher education and lower mortality even at the
group level. Therefore, two stage least squares is used where in the first stage education is
instrumented with the set of compulsory schooling laws, $CL_{cs}$, in place for each cohort and
state of birth:

$$E_{cgst} = b + CL_{cs}\rho + X_{cgst}\beta_{iv} + W_{cs}\delta_{iv} + \gamma_{iv,c} + \alpha_{iv,s} + \theta_{iv,cr} + \tau_t + u_{cgst} \qquad (3)$$
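To make the logic of instrumenting concrete, the sketch below contrasts a least squares slope with a just-identified IV slope on a tiny constructed example. It is a deliberate simplification and not the specification estimated in this paper: it uses a single binary instrument, no controls, no fixed effects and no cell weights, and all data values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Covariance with a 1/n normalisation (it cancels in the ratios below)."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Just-identified IV estimate: reduced form divided by first stage."""
    return cov(z, y) / cov(z, x)


# Hypothetical cells: z = 1 if a stricter schooling law applied,
# u = an unobserved factor that raises education and lowers mortality.
z = [0, 0, 1, 1]
u = [-1, 1, -1, 1]  # uncorrelated with z by construction
educ = [8 + zi + ui for zi, ui in zip(z, u)]
death = [0.6 - 0.05 * e - 0.03 * ui for e, ui in zip(educ, u)]  # true effect: -0.05

ols = ols_slope(educ, death)   # absorbs the effect of u, overstating the effect
iv = iv_slope(z, educ, death)  # uses only the variation in education driven by z
```

In this constructed example the IV slope recovers the assumed effect only because the instrument is unrelated to the unobserved factor by design, which is exactly the exogeneity condition that cannot be tested directly in the just-identified case.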

The instruments for the compulsory schooling laws are constructed in the following way.
The variable childcom measures the minimum required age for work minus the maximum
age required for a child to enter school, by state of birth and by the year the cohort is
age 14. This variable takes on one of eight values (that range from 0 to 10). A set of
indicator variables (excluding the 0 category) is used as instruments. In addition there is
an indicator for whether school continuation laws were in place in that state. These laws
required workers of school age to continue school part-time. For comparability, I use the
same dataset as Lleras-Muney (2005).^12 In addition to a more detailed description of these
variables in Lleras-Muney (2005), there is also an extended discussion of these measures and
their appropriateness as instruments in Lleras-Muney (2001). The variables contained in
$W_{cs}$ are the corrected versions of those used by Lleras-Muney (2005).

^12 Downloaded from Lleras-Muney’s website.
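The construction of the instrument set described above can be sketched as follows. This is a minimal illustration under stated assumptions: the text says only that childcom takes eight values between 0 and 10, so the particular list in `CATEGORIES` is hypothetical, as are the function names and the `contschool` label for the continuation law indicator.

```python
# Hypothetical set of eight childcom values between 0 and 10 (an assumption;
# the exact values are not listed in the text).
CATEGORIES = (0, 4, 5, 6, 7, 8, 9, 10)


def childcom(min_work_age, max_entry_age):
    """Minimum age for work minus maximum age for school entry."""
    return min_work_age - max_entry_age


def instrument_row(min_work_age, max_entry_age, has_continuation_law):
    """Indicator variables for one state of birth and cohort (laws at age 14)."""
    value = childcom(min_work_age, max_entry_age)
    # One dummy per category, omitting the 0 category as the reference group.
    row = {f"childcom_{k}": int(value == k) for k in CATEGORIES if k != 0}
    row["contschool"] = int(has_continuation_law)
    return row


# Example: work permitted at 14, school entry required by 7, continuation law in place.
example = instrument_row(14, 7, True)
```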

I also experiment with a second set of data independently collected by Goldin and Katz 
(Goldin and Katz 2003). Goldin and Katz carefully compared their series with the Lleras- 
Muney data and the data collected by Acemoglu and Angrist (2000) and rectified differences 
wherever possible. Since the Goldin and Katz data go back further in time it is possible to 
match all of the cohorts to the school entry age laws in effect when the cohorts were younger 
than 14. I use this data to measure the required age for school entry when the cohorts were 
likely to be at age 8 instead of age 14. In principle, incorporating this data should provide 
a better measure of the total years of compulsory schooling. 

3.2 Health Microdata and Specifications 

The second sample is constructed by pooling individuals across various panels of the Survey
of Income and Program Participation (SIPP) during the 1980s and 1990s.^13 Because participation
in many programs is closely related to an individual’s health and disability status,
the SIPP routinely collects information on health and medical conditions. The SIPP is also
ideally suited for this analysis because it contains the state of birth of all sample members,
allowing us to implement the IV strategy of using compulsory schooling laws during
childhood.

One useful outcome is self-reported health (SRH). SRH is on a 1 to 5 scale where 1
is “excellent”, 2 is “very good”, 3 is “good”, 4 is “fair” and 5 is “poor”. SRH has been
found to be an excellent predictor of mortality and changes in functional abilities among
the elderly (Case, Lubotsky, and Paxson 2002). I experiment with this measure in a few
ways. First, it is simply used as a continuous variable. Second, I use indicators for being
in poor health or in fair or poor health. Finally, I use the health utility scale that scales
the differences between the categories based on a health model using the National Health
Interview Survey.^14
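The first two codings of SRH described above can be sketched as follows. The function and key names are illustrative, and the health utility scale is omitted because its category values are not given in the text.

```python
SRH_LABELS = {1: "excellent", 2: "very good", 3: "good", 4: "fair", 5: "poor"}


def srh_outcomes(srh):
    """Return the continuous SRH score and the two indicator outcomes."""
    if srh not in SRH_LABELS:
        raise ValueError("SRH must be an integer from 1 to 5")
    return {
        "srh": srh,                      # used as a continuous variable
        "poor": int(srh == 5),           # indicator for poor health
        "fair_or_poor": int(srh >= 4),   # indicator for fair or poor health
    }


coded = [srh_outcomes(s) for s in (1, 4, 5)]
```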

^13 This includes the 1984, 1986-1988, 1990-1993 and 1996 panels.

^14 See Johnson and Schoeni (2005) and the citations there for a discussion of this approach.
A few other general outcomes are also examined. These include whether the individual
was hospitalized during the past year, the number of times she was hospitalized, the total 
number of nights spent in the hospital and the number of days spent in bed in the last four 
months. 

There is also a set of questions dealing with functional activities, activities of daily living
and instrumental activities of daily living.^15 I assembled a common set of questions that were
consistently asked across surveys. These include whether the individual has “difficulty" with 
seeing, hearing, speech, lifting, walking, climbing stairs and, whether the person can perform 
any of these activities "at all". In addition there is information on whether individuals have 
difficulty getting around inside the house, going outside of the house or getting in or out of 
bed, and whether they need the assistance of others for these activities. 

For a subset of individuals who report limited abilities in certain tasks or who have been
classified as having a work disability, detailed information is collected on a number of very
specific health conditions including: arthritis or rheumatism; back or spine problems; blindness
or vision problems; cancer; deafness or serious trouble hearing; diabetes; heart trouble;
hernia; high blood pressure (hypertension); kidney stones or chronic kidney trouble; mental
illness; missing limbs; lung problems; paralysis; senility/dementia/Alzheimer’s disease;
stiffness or deformity of limbs; stomach trouble; stroke; thyroid trouble or goiter; tumors
(cyst or growth); or other.^16 Since the specific health ailments are only asked of specific
subsamples, they probably only pick up on the most severe cases. Even though many of
our sample individuals are not actually asked about these specific health conditions, we still
include them in the estimation sample so that our sample is not a selected sample of only
those in poor health. The summary statistics for this data are shown in Table 2.

^15 These measures are derived from specific codes of the International Classification of Impairments,
Disabilities and Handicaps (ICIDH).

^16 I pool responses from the 1984, 1990-93 and 1996 SIPPs in order to maximize sample size. Unfortunately,
different criteria were used across the SIPP survey years to select the subsamples for which specific health
conditions were asked. For example, in 1996 the health conditions were asked of those who reported being
in fair or poor health. I found that it was important to combine all of the subsamples in all of the years in
order to have enough power to identify effects. There is also an additional set of 10 outcomes that are not
used because they were not available in the 1984 SIPP. Experimentation with a smaller sample suggests
that the conclusions are not altered by dropping these other outcomes.

Since most of the outcomes in the SIPP are indicator variables, I now use Two Stage
Conditional Maximum Likelihood or 2SCML (Rivers and Vuong 1988) rather than IV.^17
Rivers and Vuong show that 2SCML has desirable statistical properties, is easy to implement
and produces a simple test for exogeneity. I continue to use IV for the few continuous
dependent variables. Also, all of the analysis is now done using individual level data. The
statistical model is similar to (2), only now I use the latent variable framework:

$$y^*_{it} = a + E_i\pi + X_i\beta + W_{cs}\delta + \gamma_c + \alpha_s + \text{trends} + \tau_t + e_{it} \qquad (4)$$

$$y_{it} = 1 \ \text{if} \ y^*_{it} > 0, \qquad y_{it} = 0 \ \text{if} \ y^*_{it} \le 0 \qquad (5)$$

In the first stage, I run a similar regression as before:

$$E_i = b + CL_{cs}\rho + X_i\beta + W_{cs}\delta + \gamma_c + \alpha_s + \text{trends} + \tau_t + \varepsilon_{it} \qquad (6)$$

To implement 2SCML, I use the predicted residuals from (6), $\hat{\varepsilon}_{it}$, and include them as an
additional right hand side variable (along with the actual value of $E_i$) when running the
second stage probit. For comparability I use the same sample restrictions and covariates
as Lleras-Muney with only a few exceptions. Unlike Lleras-Muney, I include a quadratic
in age. In addition, state-specific cohort trends are used to address concerns that region of
birth interacted with cohort may not adequately control for state-specific factors that are
smoothly changing over time.^18

^17 I thank Jay Bhattacharya for this suggestion. In a previous version of the paper I found very similar
results using two stage least squares for the dichotomous outcomes.

^18 I generally found that the IV results were larger and more significant when using the state trends than
when using region of birth interacted with cohort. The OLS results were virtually identical under either
specification.

4 Baseline Results

4.1 Mortality

This section begins by describing the replication of Lleras-Muney (2005) and the discovery
of a coding error concerning the appropriate base period. Using the correct base period
increases the point estimates: the new WLS estimate is approximately doubled and the IV
estimate is 50% higher than reported in Lleras-Muney (2005). I then expand the Census
data by: (i) expanding beyond the 1% sample to the 2% and 5% IPUMS samples available
for 1970 and 1980, respectively, and (ii) incorporating the 1990 and 2000 Census data in the
mortality analysis.

4.1.1 Replication and Correction

For the mortality analysis I start with the same sample as Lleras-Muney (2005). I have an
identical number of individuals (814,805) drawn from the 1960 and 1970 Censuses and match
nearly all the summary statistics in her Table 1. Nevertheless, I find some large differences
when implementing WLS or IV at the cell level. This is shown in Table 3. The first two
columns show the WLS and IV estimates from Lleras-Muney while columns 3 and 4 show
my estimates. Compared to her WLS estimate of -.017, I obtain a coefficient of -.036. The
difference in the estimates is statistically significant. For IV, once again I obtain a much
larger estimate (-.072) than her estimate (-.051).

After some experimentation I speculated that Lleras-Muney did not use education calculated
in the base period, t, but instead calculated education in period t + 1. If I use education
calculated in t + 1, then my estimates are much closer to hers. These are shown in columns
5 and 6. After she graciously provided her computer code, I confirmed with Lleras-Muney that
this was indeed the case, and she has since written an erratum (Lleras-Muney 2006) providing
corrected estimates.^19 The problem with using education in the later period is that the
sample has already experienced selective mortality based on education.

^19 The erratum (Lleras-Muney 2006) reports an IV coefficient of -0.063 (0.024) compared to the estimate of

The discrepancy between the results are not only statistically meaningful but are quan- 
titatively important. Taken at face value, the "corrected" IV result implies that an extra 
year of schooling reduces the likelihood of dying over the next ten years by more than 7 
percentage points. In the sample, the mean death rate is only about 10.6 percent. This 
suggests that one more year of schooling lowers mortality risk by nearly 70 percent -a result 
that is perhaps implausibly large. 

4.1.2 Expansion of the Censns Sample 

In Table 4, I show how the results change as the sample is enlarged and the specifications 
are modified. I begin by just focusing attention on the first two columns showing the WLS 
and IV estimates. In panel A I isolate the effects of using larger samples for 1970 and 1980. 
Row 1 repeats the results from Table 3. In row 2 I find that the WLS estimate rises to -0.045 
and that the IV estimates drop considerably to -0.043. The greater precision is evident in 
the standard error for the IV estimate which declines by about 25 percent. In rows 3 and 4 
I control for age and find that this lowers the WLS estimates a little and increases the IV 
estimates a little. In row 5 I drop the region of birth interactions with cohort and instead 
use state specific linear (cohort) trends. This raises the WLS estimate but I now find that 
IV coefficient is significantly lower and is no longer statistically significant at conventional 
levels. 

In panel B I add data from the 5 percent samples of the 1990 and 2000 Censuses. With 

this larger dataset I construct death rates over four ten year periods and therefore follow 

cohorts over a longer period of time with a considerably larger sample. Given that the 

sample also tracks the cohorts later in life when mortality rates are much higher, the age 

controls are essential. I use a cubic in age although I find that the results are not very 

sensitive to the choice of the polynomial. Since medical technology and other health-related 

factors might change over time, I have also interacted the cubic in age with the Census 
-0.072 reported here. I was unable to resolve this difference. 
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year. In my preferred specification (row 6) I now find that both the WLS and IV estimates 
are about -0.035 which appear to be much more plausible. With this larger sample the 
inclusion of state specific cohort trends again results in a point estimate that is much smaller 
in magnitude (-0.02) and not statistically distinguishable from zero (row 7). 
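
Whether each IV coefficient in Table 4 clears a conventional significance threshold can be read off the ratio of the coefficient to its standard error. The sketch below does this for selected rows of Table 4, using a two-sided 5 percent normal critical value of 1.96. The threshold and the normal approximation are assumptions for illustration, not necessarily the test used for the table.

```python
# IV coefficient on education and its standard error, selected rows of Table 4.
iv_rows = {
    "(1) 1% samples, 1960-1980": (-0.072, 0.025),
    "(2) larger 1970 and 1980 samples": (-0.043, 0.020),
    "(5) row 2 plus state cohort trends": (-0.032, 0.021),
    "(6) 1960-2000 with age cubic by year": (-0.035, 0.014),
    "(7) row 6 plus state cohort trends": (-0.020, 0.015),
}

CRITICAL_VALUE = 1.96  # assumed two-sided 5 percent normal threshold


def z_ratio(coef, se):
    """Coefficient divided by its standard error."""
    return coef / se


def is_significant(coef, se, critical=CRITICAL_VALUE):
    """True when the absolute z-ratio exceeds the critical value."""
    return abs(z_ratio(coef, se)) > critical


for label, (coef, se) in iv_rows.items():
    flag = "significant" if is_significant(coef, se) else "not significant"
    print(f"{label}: z = {z_ratio(coef, se):.2f} ({flag})")
```

On these inputs the rows with state specific cohort trends (5 and 7) fall short of the threshold, while rows 1, 2 and 6 exceed it.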

The third column shows the effects of using the Goldin and Katz data for constructing 
the instruments. For most specifications in panels A and B they produce estimates similar 
to the baseline IV results, although the standard errors are a bit higher. It is worth noting 
that with the Goldin and Katz data the inclusion of state specific cohort trends lowers the 
size of the point estimates even more dramatically and also yields the same conclusion, that 
the estimates are not statistically different from zero. 

4.1.3 Effects by Subgroups 

In the remaining panels of Table 4 I examine how the effects vary by year, age and cohort. 
In panel C I separately estimate the education coefficient for each Census year. Since the 
specification includes a full set of cohort dummies, these are equivalent to age controls when 
using a single Census year. Although the WLS estimates are significant in all years, they 
peak in 1970 at -0.061 and drop to only -0.015 by 1990. The IV estimates have large standard 
errors and are therefore imprecise. Nonetheless, the estimates are large 
only for 1960 and are essentially zero for 1980 and 1990. This is true whether I use the 
Lleras-Muney data or the Goldin and Katz data. In panel D I stratify the sample by three age 
ranges: 35-55, 55-65 and 65-89. Here I observe different patterns across the three columns, 
making it difficult to interpret the estimates. The WLS and IV estimates from column 2 
suggest that the largest effect may be for those aged 55 to 64, while the IV estimates with 
the Goldin and Katz data suggest the opposite. Given the imprecision of the estimates I cannot 
draw any meaningful inferences regarding the age pattern. 

Panel E, however, provides a striking result that appears to be consistent across the two 
IV specifications. It appears that the entire effect of education on mortality arising from 
compulsory schooling laws is due to cohorts born from 1901 to 1912, who constitute just 
over 40 percent of the sample. In fact, for those born from 1913 to 1925, the point estimate 
is actually positive in both column 2 and column 3. Using the Lleras-Muney data, the 
estimate is 0.035 and is significant at the 7 percent level. Taken as a whole, these results 
suggest that the results from Lleras-Muney are driven by cohorts 
born very early in the century and their mortality experience during the 1960-1970 period. 
One possible explanation could be that the effect of education stayed roughly constant but 
that compulsory schooling laws had their biggest bite for those born earlier in the century. 
However, I have run the first stage regressions by these cohort groupings and found that the 
partial F-statistics on the instruments are similar for both cohorts when using the Lleras-Muney 
data and are actually higher for the 1913 to 1925 cohorts when using the Goldin and 
Katz data. This suggests that the schooling laws may actually have been more binding for 
the later cohorts, casting doubt on this alternate explanation. In other estimates that are 
not shown in the table I found no statistically significant difference between men and women, 
although the point estimates were larger (in absolute value) for men using the Lleras-Muney 
data and very imprecisely estimated using the Goldin and Katz data. 
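
The first-stage comparison above relies on the partial F-statistic for the excluded instruments, that is, the F-test of the instruments in a regression of schooling on the instruments and the other controls. A minimal sketch of that calculation on synthetic data follows. The variable names, sample size and coefficients are hypothetical and chosen only to make the example run, and the helper functions are not the code used in this paper.

```python
import random


def ols_rss(X, y):
    """Residual sum of squares from an OLS regression of y on the columns of X."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def partial_f(controls, instruments, y):
    """F-statistic for excluding the instruments from the first stage."""
    restricted = [[1.0] + list(c) for c in controls]
    unrestricted = [r + list(z) for r, z in zip(restricted, instruments)]
    q = len(instruments[0])                 # number of excluded instruments
    dof = len(y) - len(unrestricted[0])     # residual degrees of freedom
    rss_r = ols_rss(restricted, y)
    rss_u = ols_rss(unrestricted, y)
    return ((rss_r - rss_u) / q) / (rss_u / dof)


# Synthetic first stage: schooling depends on one control and one binary instrument.
rng = random.Random(0)
n = 500
controls = [[rng.gauss(0.0, 1.0)] for _ in range(n)]
instruments = [[1.0 if rng.random() < 0.5 else 0.0] for _ in range(n)]
schooling = [
    10.0 + 0.3 * c[0] + 1.0 * z[0] + rng.gauss(0.0, 1.0)
    for c, z in zip(controls, instruments)
]

print(f"partial F = {partial_f(controls, instruments, schooling):.1f}")
```

Running the same function separately on each cohort grouping would give the kind of side-by-side comparison of instrument strength described in the text.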

4.2 Health Outcomes from the SIPP 

Table 5 presents the results for health outcomes using the SIPP microdata. 
The first column shows the effects of education using a simple probit (or OLS), which does 
not account for endogeneity. The second column presents the 2SCML (or IV) estimates 
using the compulsory schooling laws as instruments. Given the possible effects of education 
on mortality and the fact that outcomes in the SIPP are not observed until at least 1984, 
one might not expect any remaining health effects to be apparent. As it turns out, I do find 
significant effects using the instruments for several broad health outcomes. The first row 
shows that self reported health (SRH) measured as a continuous variable is affected by education. 
The IV estimate of -0.23 is more than twice the OLS estimate (-0.09). Using 
a Hausman test (column 4), one can reject that the OLS and IV coefficients are the same at the 7 
percent level. Translating SRH into a health index on a 1 to 100 scale following Johnson 
and Schoeni's (2005) approach, the IV estimate implies that an increase in schooling by 
one year improves the health index by 4.5 points, or about 7 percent evaluated at the mean 
(column 3). I also estimate that the probability of being in fair or poor health is reduced 
by 8.2 percentage points with an additional year of schooling, a fairly large effect that is 
statistically different from the naive probit at the 18 percent significance level. I do not find, 
however, that any of the measures of hospitalization or days spent in bed are significant 
when accounting for endogeneity. 
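
For a single coefficient, the Hausman comparison of OLS and IV reduces to a chi-squared statistic with one degree of freedom. The sketch below shows that scalar form. It illustrates the general idea rather than the exact test computed for Table 5, and the standard errors in the usage lines are placeholders invented for the example, not values taken from Table 5.

```python
import math


def hausman_scalar(b_ols, se_ols, b_iv, se_iv):
    """Scalar Hausman statistic and p-value (chi-squared, 1 degree of freedom)."""
    var_diff = se_iv ** 2 - se_ols ** 2
    if var_diff <= 0:
        raise ValueError("IV variance must exceed OLS variance for this form of the test")
    stat = (b_iv - b_ols) ** 2 / var_diff
    # For one degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Coefficients are the SRH estimates quoted in the text; the standard errors
# (0.01 and 0.07) are hypothetical placeholders, not values from Table 5.
stat, p_value = hausman_scalar(b_ols=-0.09, se_ols=0.01, b_iv=-0.23, se_iv=0.07)
print(f"Hausman statistic = {stat:.2f}, p-value = {p_value:.3f}")
```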

Looking across a variety of measures of physical functional abilities, I find that while 
all of the naive probit estimates are significant and of the expected sign, the two stage 
estimates are typically not significant. Given the large health effects discussed above, it is 
striking that those who have an additional year of schooling due to compulsory schooling 
laws are no less likely to have trouble lifting, walking, climbing stairs, getting around outside the 
house, getting around inside the house or getting into or out of bed. In fact, for many of 
these outcomes the coefficients are actually positive! On the other hand, those with greater 
schooling associated with compulsory schooling laws are dramatically less likely to experience 
problems with vision, hearing or speaking. In almost all of these cases the differences between 
the simple probit and the 2SCML estimates are very large and statistically different at about 
the 10 percent level. For example, the 2SCML estimates imply that an additional year of 
schooling reduces the probability of having trouble "seeing" by 5.6 percentage points. In this 
sample the mean rate of this health outcome is 13.6 percent. These results might suggest 
that the channel by which general health is compromised for those with less schooling may 
be related to sensory functions. 
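
To put the point estimates on a common footing, each can be divided by the sample mean of the corresponding outcome. The sketch below does this for four estimates quoted in the text, taking the means from the text and from Table 2. It is simple arithmetic on reported numbers rather than an additional result.

```python
# (absolute effect of one more year of schooling, sample mean of the outcome)
effects = {
    "ten-year mortality": (0.072, 0.106),    # IV estimate, mean death rate
    "health index": (4.5, 67.992),           # points on the 1 to 100 index
    "fair or poor health": (0.082, 0.357),
    "trouble seeing": (0.056, 0.136),
}


def relative_effect(effect, mean):
    """Absolute effect expressed as a share of the outcome mean."""
    return effect / mean


for outcome, (effect, mean) in effects.items():
    print(f"{outcome}: {100 * relative_effect(effect, mean):.0f} percent of the mean")
```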

The next set of results estimates the incidence of specific health conditions. Recall that 
these conditions are only identified for subsets of individuals and that the screening criteria 
have changed across SIPP survey years. Also recall that all individuals are included regardless 
of whether they were screened for this question so as to avoid using a selected sample 
of only those in poor health. Generally, the underlying health conditions were only asked 
of individuals who reported particular kinds of activity limitations, reported having a work 
disability or reported being in fair or poor health. This is captured by the variable "difficulty" 
which, not surprisingly, is significant under both probit and 2SCML. When I turn 
to the estimated likelihood of having one of the underlying health conditions, the probit 
estimates once again are significant in every case. The 2SCML estimates, however, are 
negative and significant for only four outcomes: back or spine problems; stiffness or deformity of a 
limb; diabetes; and senility/dementia/Alzheimer's disease. It is important to point out that 
"trouble seeing", "trouble hearing" and "trouble speaking" were never used as screening 
criteria for asking about an underlying health condition. This likely explains why blindness 
and deafness are not significant with the subsamples. 

Another interesting result is that both kidney problems and hypertension appear to be 
positively associated with more schooling. This is especially notable because these are 
two outcomes for which self-management and recent technological advance appear to be 
especially important. According to Appendix Table B of Glied and Lleras-Muney (2003), 
treatment of kidney infections experienced substantial innovation. Among the 56 causes 
of death, it experienced the fastest decline in age-adjusted mortality from 1986 to 1995, 
falling at more than nine percent per year (Glied and Lleras-Muney 2003: 8, Appendix 
Table B). Accordingly, a steep (negative) gradient between education and kidney disease 
would presumably be expected. It is therefore of note that the 2SCML specification finds an 
increase in the incidence of kidney problems among those with high education. Treatment of 
diabetes is "often considered the prototype for chronic disease management" (Goldman and 
Smith 2002). Our findings, which analyze a broad range of health conditions and chronic 
diseases, would suggest that insofar as formal schooling is concerned, diabetes appears 
to be an exception. In the SIPP data, only diabetes enters in the expected direction, 
i.e. increases in schooling appear to reduce diabetes incidence. An alternate explanation 
for the diabetes result could be that states that had higher compulsory schooling levels also 
promoted nutritional policies that might have reduced adult onset of diabetes. Overall, 
however, one conclusion that may be drawn from this table is that there is little support for 
the "decision-making" hypothesis. 

It is also worth noting that explanations for the link between education and health that 
focus on resources (e.g. income, occupation) or unobserved time preferences do not appear 
to be consistent with these results. These explanations would likely imply that all outcomes 
ought to be affected, not just a few. 

The major caveat to this analysis is that we observe individuals only if they have survived 
into the 1980s and 1990s, when they are anywhere between the ages of 59 and 83. This 
sample is almost certainly positively selected on education and health, making it unclear 
how generalizable these results are. I suspect that due to this selection the results are 
biased against finding any effects of education on improving health, which makes it all the more 
surprising that there are very large negative coefficients on the incidence of several negative health 
outcomes. 



5 Smallpox Vaccination 

5.1 Alternative Explanations 

The results thus far present something of a puzzle as to exactly how compulsory 
schooling laws early in the twentieth century led to improved long-term health status. While 
the results cast doubt on the traditional explanations offered in the literature of how education 
improves health, the results do not appear to point to any obvious alternative explanation. 
One general hypothesis worth considering is that schools served as an important 
place for implementing a variety of policies that may have impacted both education and 
health directly. It could be that states and cities during this period were introducing many 
reforms contemporaneously and schools were one obvious target for these reforms. 

In fact, it was noted at the time that it was pointless to force children to attend schools if they 
were unable to learn. In 1904, Robert Hunter wrote in the book Poverty: "There must be 
thousands - very likely sixty or seventy thousand children - in New York City alone who often 
arrive at school hungry and unfitted to do well the work required. It is utter folly, from the 
point of view of learning, to have a compulsory school law which compels children, in that 
weak physical and mental state which results from poverty, to drag themselves to school 
and to sit at their desks, day in and day out, for several years, learning little or nothing." 
In response to this situation Philadelphia, Boston, Milwaukee, New York, Cleveland, 
Cincinnati and St. Louis all began large scale programs to provide food in public schools 
during the 1900s and 1910s (Gunderson 1971). 

It seems plausible, then, that coincident with the enactment of compulsory schooling laws 
there were many efforts (legislative or otherwise) to improve the general condition of 
children. In this section I pursue one specific alternative hypothesis that might explain 
some of the findings. Specifically, I examine whether the association between compulsory 
schooling laws and health may have been due, in part, to early century school requirements 
concerning vaccination against smallpox. 

5.2 Background on Smallpox 

Before Edward Jenner invented the first vaccine in 1797, smallpox was a widespread and 
brutal disease, killing about 400,000 Europeans a year, with survivors accounting for about 
one-third of all cases of blindness (Henderson and Moss 1999). Smallpox was especially 
concentrated among children: in the early 19th century smallpox accounted for one-third 
of the deaths of all children (George Palmer and Ingen 1930). More than a century after 
the development of the vaccine, smallpox remained a deadly threat in the United States. A 
report in the New England Journal of Medicine in 1930 showed that between 1919 and 1928 
there were more than half a million cases of smallpox in the US and argued that ". . . the 
United States remains now . . . the most smallpox ridden country in the world bar possibly 
China, India and (doubtfully) Russia." 

In addition to blindness, survivors of smallpox are also known to have a higher rate of 
encephalitis (inflammation of the brain); see AMA (1999). Although encephalitis is relatively 
rare, milder forms of the condition are likely to go unreported. Symptoms of encephalitis include 
problems with speech, hearing and double vision (see 
http://www.ninds.nih.gov/disorders/encephalitis_meningitis/detail_encephalitis_meningitis.htm). 
This suggests that vaccination against 
smallpox in schools may have reduced the incidence of compromised sensory functions, as 
we find in our SIPP sample. 

5.3 Vaccination in schools 

States began to require vaccination against smallpox in schools beginning in the late 19th 
century (Hanlon 1969). I have been able to compile information concerning state laws 
regarding school vaccination for the years 1915, 1921, 1926 and 1941. In the first snapshot 
in 1915, fourteen states had requirements for vaccination. In the other three years I found 
no cases of any additional states requiring vaccination for schools. Similarly, I found no 
cases of any states repealing these laws. Therefore, I am unable to construct a panel design 
analogous to that employed by Lleras-Muney for compulsory schooling laws, since there is no 
variation over time. 

I also assembled data on states that had laws authorizing the use of vaccination in the 
case of outbreaks but found that these laws relied critically on enforcement. There were also 
a few cases where states changed laws regarding the prohibition of vaccination in schools, but 
it is doubtful that these law changes have enough power to identify effects in the samples 
used in this study. 

Fortunately, a 1930 White House sponsored report on the state of young children's 
health does contain detailed data on young children's vaccination rates by age for 156 of the 
largest cities (George Palmer and Ingen 1930). For the time period this was an impressive 
data collection effort where information was collected from around 3000 doctors and other 
health providers on the frequency of health exams and dental exams in addition to rates of 
vaccination against smallpox and immunization against diphtheria. The data on smallpox 
vaccination rates was aggregated to the state level. This is displayed in Table 6. What is 
striking is how the vaccination rate jumps sharply from age four to age five in many states 
as children prepare for school entry. Although some of the richer states in the Northeast 
like New York have sizable jumps, there is a great deal of variation even within regions. For 
example, Colorado, Georgia and Kentucky are among the states with the largest increases in 
vaccination between age four and age five. These data also illustrate the potential pitfall of 
using actual state laws, since some of the states that ostensibly required vaccination did not 
exhibit big increases (e.g. Arkansas, South Carolina) while other states that did not legally 
require vaccination, in practice, exhibited large increases in vaccination by school age. 

Although there is only vaccination rate data for one year, 1930, I use this data to test 
the extent to which the health effects of education may be operating through differences 
in vaccination policy. Specifically, I consider how different the health effects of education 
are for states that have stringent vaccination laws versus those that do not. I assume that 
the stringency of school vaccination requirements can be proxied by the change in the 
vaccination rate from age four to age five. Obviously, this approach is only ideal for the 
youngest cohort (born in 1925), who would have turned five in 1930. I analyze this first 
for the baseline mortality sample (row 6 in Table 4). The sample is split by states with 
a change in vaccination rate of more than 10 percentage points (the median change) and 
those with a change that is 10 points or fewer. 
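
The sample split described above can be sketched as follows. The state names and vaccination rates here are hypothetical placeholders rather than entries from Table 6; only the 10 percentage point cutoff comes from the text.

```python
# Hypothetical vaccination rates (percent) at ages four and five, by state.
vaccination = {
    "State A": (12.0, 45.0),
    "State B": (20.0, 24.0),
    "State C": (8.0, 18.0),
    "State D": (15.0, 31.0),
}

CUTOFF = 10.0  # percentage points, the median change reported in the text


def split_by_stringency(rates, cutoff=CUTOFF):
    """Split states by the age-four to age-five change in the vaccination rate."""
    stringent, lenient = [], []
    for state, (age4, age5) in rates.items():
        # A change strictly above the cutoff counts as stringent.
        (stringent if age5 - age4 > cutoff else lenient).append(state)
    return stringent, lenient


stringent, lenient = split_by_stringency(vaccination)
print("more than 10 points:", stringent)
print("10 points or fewer:", lenient)
```

A state whose change equals the cutoff exactly (State C in this made-up example) falls in the "10 points or fewer" group, matching the wording of the split.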

The results are shown in Table 7. I find that the IV estimates of the effects of schooling on 
mortality are statistically significant only in the states with large increases in vaccination by 
school age and that the IV coefficients are of the wrong sign in the states with less stringent 
vaccination requirements. I experimented with randomly splitting the sample 100 times and 
found that the odds of finding an equivalent result by chance are only about 15 percent. 
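
One way to carry out a random-split check of this kind is sketched below. The function names, the state-level numbers and the use of a simple difference in group means as the statistic are all assumptions made for illustration; the exercise reported above presumably compares IV estimates across the two random groups instead.

```python
import random


def placebo_share(units, gap, observed_gap, n_draws=100, seed=0):
    """Share of random half-splits whose gap is at least as large as observed."""
    rng = random.Random(seed)
    half = len(units) // 2
    hits = 0
    for _ in range(n_draws):
        shuffled = list(units)
        rng.shuffle(shuffled)
        if gap(shuffled[:half], shuffled[half:]) >= observed_gap:
            hits += 1
    return hits / n_draws


def mean_gap(group_a, group_b):
    """Absolute difference in mean outcome between two groups of (name, value) pairs."""
    mean_a = sum(v for _, v in group_a) / len(group_a)
    mean_b = sum(v for _, v in group_b) / len(group_b)
    return abs(mean_a - mean_b)


# Hypothetical state-level outcomes; the first four play the role of the
# high-vaccination states in the actual split.
states = [("A", 0.9), ("B", 0.7), ("C", 0.8), ("D", 0.6),
          ("E", 0.2), ("F", 0.4), ("G", 0.1), ("H", 0.3)]
observed = mean_gap(states[:4], states[4:])
share = placebo_share(states, mean_gap, observed)
print(f"share of random splits with a gap at least as large: {share:.2f}")
```

A small share indicates that the observed contrast between the two groups of states is unlikely to arise from an arbitrary partition.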

I then performed the same exercise with the SIPP sample, looking at the outcomes that 
I found to be significant for the full sample. In this sample the results are more mixed. I 
show a representative set of results from the SIPP in Table 7. For self reported health, the 
estimates are actually larger and more statistically significant in the states with relatively 
less stringent vaccination requirements. However, the effect on being in poor health is only 
apparent in the high vaccination states. Since poor health is a strong predictor of mortality, 
this appears to be consistent with the mortality finding. Most of the other estimates by 
subsample are too noisy to support firm conclusions, but it does appear that hearing is strongly 
affected in the states with more stringent school vaccination requirements. Since I do not 
have a time series on vaccination rates, and given the relative bluntness of the approach, I 
only claim that these results are suggestive of a possible mechanism relating compulsory 
schooling laws and long-term health that operates through school vaccination. 

6 Conclusion 

This paper expands upon the growing literature that attempts to identify whether there is 
a causal effect of education on health by also considering how education might affect health. 
I closely examine the effects of education induced by compulsory schooling laws early in 
the twentieth century on long-term health using several approaches. First I revisit the 
results in Lleras-Muney (2005) by expanding the Census sample and employing a variety of 
robustness checks. The main finding is that the effects of education on mortality induced 
by changes in compulsory school laws are not robust to including state specific time trends. 
I also find that all of the effects are for cohorts born between 1901 and 1912 and their 
mortality experience during the 1960s. 

Second, I use the SIPP to identify not only general health effects but also specific health 
outcomes that were induced by changes in state compulsory schooling laws to see if these 
outcomes correspond to our existing theories of how education affects health. The results 
suggest that there is a large effect of education on general health status arising from 
compulsory schooling laws that is robust to state time trends. However, I find that with 
the sole exception of diabetes, none of the other specific health conditions that are associated 
with education (e.g. vision, hearing, speaking ability, back problems, deformities, senility) 
correspond to the leading theories of how education improves health (e.g. technological 
improvements, better decision-making, higher income). This suggests that either our theories 
are incorrect or that the compulsory schooling laws are suspect instruments. An important 
caveat, however, is that the SIPP analysis uses a sample of older individuals who are 
almost surely positively selected on education and health. While this likely makes it more 
difficult to detect effects of education on improved health, it also raises questions as to how 
generalizable these results are. 

Third, I look at one specific alternate hypothesis of how state-level compulsory school- 
ing laws might have influenced long-term health, namely through requirements for smallpox 
vaccination as a condition for school entry. I stratify the samples by states with stringent 
versus nonstringent vaccination requirements and find that all of the effects of education 
on mortality and poor health status were registered in states with stringent vaccination 
requirements. This provides some suggestive evidence that smallpox vaccination may ac- 
count for some of the link between education and health induced by compulsory schooling 
laws. It is also worth noting that survivors of smallpox are known to suffer from compromised 
vision, speaking and hearing, which are among the few effects that we detected in our 
IV results with the SIPP. I conclude from these exercises that there is reason to be concerned 
about whether compulsory schooling laws can be used as instruments to draw meaningful 
inferences about the causal effects of education on long-term health. Instead it could well 
be that either other school-based reforms directly impacted long-term health or that other 
reforms with long term impacts took place at the same time that compulsory schooling 
requirements became more stringent. In any event, the results suggest that even if there is 
a causal effect of education on health there is still a great deal of uncertainty about how 
education improves health that should remain an important topic for further research. 
Table 1: Summary Statistics for IPUMs samples 

                               1960-1980, 1% only          1960-2000, 1%, 2% or 5%
Variables                       Mean  Std. dev.      N        Mean  Std. dev.      N

Ten Year death rates
  overall                      0.108      0.136   4792       0.213      0.173   8636
  1960-70                      0.110      0.119   2395       0.113      0.105   2397
  1970-80                      0.105      0.152   2397       0.154      0.125   2400
  1980-90                          -          -      -       0.287      0.170   2399
  1990-00                          -          -      -       0.433      0.122   1440

Individual Characteristics
  Education                   10.548      0.990   4795      10.729      1.002   8636
  1960 Dummy                   0.471      0.499   4795       0.325      0.469   8636
  1970 Dummy                       -          -      -       0.289      0.453   8636
  1990 Dummy                       -          -      -       0.142      0.349   8636
  Female                       0.517      0.500   4795       0.532      0.499   8636
  Age                         50.366      8.482   4795      56.811     11.287   8636
  Born in 1905                 0.031      0.174   4795       0.025      0.157   8636
  Born in 1910                 0.038      0.191   4795       0.031      0.174   8636
  Born in 1915                 0.044      0.205   4795       0.047      0.211   8636
  Born in 1920                 0.048      0.213   4795       0.052      0.222   8636
  Born in 1925                 0.050      0.217   4795       0.057      0.232   8636

State of Birth Characteristics
  % Urban                     53.523     21.279   4795      53.778     21.153   8636
  % Foreign                   11.737      8.523   4795      11.562      8.430   8636
  % Black                      8.983     11.901   4795       8.945     11.787   8636
  % Emp. in Mfg.               0.067      0.038   4795       0.066      0.037   8636
  Ann. Mfg. Wage            7171.387   1343.089   4795    7206.147   1353.573   8636
  Val. of Farm per Acre      540.048    276.353   4795     535.182    272.569   8636
  P.C. # of Doctors            0.001      0.000   4795       0.001      0.000   8636
  P.C. Educ. Expenditures     97.006     42.054   4795      99.779     41.706   8636
  # Schl Bldgs/Sq. Mile        0.174      0.090   4795       0.172      0.090   8636

Notes: Summary statistics are for state of birth, cohort and gender cells. All means and standard 
deviations use sample weights where the weights are the population estimates for the cell in the 
base period. 






Table 2: Summary Statistics for SIPP sample 

Variables                                       Mean  Std. dev.       N

Outcomes
  Self Reported Health                         3.084      1.138   26030
  Poor Health                                  0.119      0.324   26030
  Fair or Poor Health                          0.357      0.479   26030
  Health Index                                67.992     24.842   26030
  Hospitalized in Last Year                    0.180      0.384   26484
  Days in Bed, last 4 months                   3.937     17.030   25223
  Number of Times Hospitalized                 0.282      1.029   22229
  Number of Nights in Hospital                 1.908      7.898   26274
  Trouble Seeing                               0.136      0.342   20853
  Trouble Hearing                              0.152      0.359   20845
  Trouble Speaking                             0.021      0.144   20834
  Trouble Lifting                              0.237      0.425   20837
  Trouble Walking                              0.289      0.453   20799
  Trouble with Stairs                          0.276      0.447   20820
  Trouble Getting Around Outside the Home      0.129      0.335   17401
  Trouble Getting Around Inside the Home       0.059      0.235   17643
  Trouble Getting In/Out of Bed                0.079      0.270   17636
  Trouble Seeing at all                        0.023      0.149   20811
  Trouble Hearing at all                       0.013      0.114   20819
  Trouble Speaking at all                      0.003      0.052   15138
  Trouble Lifting at all                       0.115      0.319   20789
  Trouble Walking at all                       0.154      0.361   20723
  Trouble with Stairs at all                   0.116      0.321   20775
  Needs Help Getting Around Outside            0.088      0.283   13610
  Needs Help Getting Around Inside             0.024      0.154   13893
  Needs Help Getting In/Out of Bed             0.025      0.156   13868
  Work limitation due to health conditions     0.423      0.494   19073
  Arthritis                                    0.129      0.335   19073
  Back                                         0.062      0.242   19073
  Blind                                        0.026      0.159   19073
  Cancer                                       0.016      0.125   19073
  Deaf                                         0.023      0.149   19073
  Deformity                                    0.027      0.162   19073
  Diabetes                                     0.030      0.170   19073
  Heart                                        0.090      0.287   19073
  Hernia                                       0.006      0.080   19073
  Hypertension                                 0.036      0.185   19073
  Kidney                                       0.005      0.067   19073
  Lung                                         0.043      0.203   19073
  Mental Illness                               0.005      0.067   19073
  Missing Limb                                 0.003      0.056   19073
  Paralysis                                    0.006      0.075   19073
  Senility                                     0.007      0.084   19073
  Stomach                                      0.010      0.099   19073
  Stroke                                       0.021      0.144   19073
  Thyroid                                      0.003      0.056   19073
  Other                                        0.066      0.247   19073

Individual Characteristics
  Education                                   11.432      3.208   26030
  Female                                       0.580      0.494    4795
  Age                                         72.079      5.606    4795






Table 3: Replicating Lleras-Muney's Estimates of Effects of Education on Mortality 

Dependent variable is ten year mortality rate 

                                Lleras-Muney (2005)        Replication         Replication with
                                                                               Wrong Base Year
                                      WLS         IV        WLS         IV        WLS         IV

Individual Characteristics
  Education                        -0.017     -0.051     -0.036     -0.072     -0.016     -0.059
                                  (0.004)    (0.026)    (0.004)    (0.025)    (0.004)    (0.026)
  Female                           -0.074     -0.071     -0.072     -0.067     -0.074     -0.070
                                  (0.003)    (0.004)    (0.003)    (0.004)    (0.003)    (0.004)

State of Birth Characteristics
  % Urban                        -1.0E-04      0.001    9.3E-04      0.002    4.4E-04      0.002
                                (9.4E-04)    (0.001)  (9.9E-04)    (0.001)  (1.0E-03)    (0.001)
  % Foreign                      -5.6E-04    -0.0001     -0.001    -0.0005     -0.002    -0.0011
                                  (0.002)    (0.002)    (0.002)    (0.002)    (0.002)    (0.002)
  % Black                          -0.002    -0.0009   -8.1E-04   -5.9E-05     -0.001   -5.2E-04
                                  (0.002)    (0.002)    (0.002)    (0.002)    (0.002)    (0.002)
  % employed in mfg.               -0.071      -0.11   -9.3E-04     -0.039      0.010     -0.066
                                  (0.101)    (0.108)    (0.236)    (0.246)    (0.234)    (0.246)
  Annual mfg. wage                7.4E-07      0.000    3.4E-07    4.7E-07    3.0E-07    5.6E-07
                                (3.1E-06)    (0.000)  (4.1E-06)  (4.3E-06)  (4.0E-06)  (4.3E-06)
  Val. of farm per acre           2.7E-06      0.000   -1.2E-06   -8.9E-06    3.6E-06   -4.9E-06
                                (1.7E-05)    (0.000)  (1.9E-05)  (2.0E-05)  (1.8E-05)  (2.0E-05)
  Per capita # of doctors           0.242      7.926     16.394     42.372     -0.511     26.405
                                 (13.891)   (15.059)   (20.993)   (26.445)   (20.897)   (25.656)
  Per capita education exp.       1.9E-05      0.000    5.3E-05    6.2E-05    4.4E-05    4.5E-05
                                (7.9E-05)    (0.000)  (8.3E-05)  (9.4E-05)  (7.6E-05)  (8.5E-05)
  # school bldgs/sq. mile          -0.008     -0.005 -0.0135001     -0.012     -0.015     -0.013
                                  (0.062)    (0.065)    (0.063)    (0.067)    (0.062)    (0.065)

N                                    4792       4792       4792       4792       4792       4792
R squared                          0.3575          -     0.3606     0.3536     0.3549     0.3425

Notes: All specifications include a dummy for 1970, 24 cohort dummies, 47 state of birth dummies, region of 
birth interacted with cohort and an intercept. Estimates are weighted using the number of observations in 
the cell in the base year. Standard errors, shown in parentheses, are clustered at the state of birth and 
cohort level. 






Table 4: New Estimates of Effects of Education on Mortality

Dependent variable is ten year mortality rate. Table entries are the coefficient on education.

                                                                             Goldin/Katz
Sample and specification                             WLS        IV       N   Instruments

Panel A (1960-80)
(1) 1% 1960-1980                                  -0.036    -0.072    4792             -
                                                 (0.004)   (0.025)
(2) 1% 1960, 3% 1970, 5% 1980                     -0.045    -0.043    4797        -0.045
    drop allocated age, education, birthplace    (0.004)   (0.020)               (0.024)
(3) Sample (2) with age cubic                     -0.039    -0.046    4797        -0.047
                                                 (0.004)   (0.020)               (0.024)
(4) Sample (2) with age cubic*yr                  -0.040    -0.046    4797        -0.047
                                                 (0.004)   (0.020)               (0.024)
(5) Sample (2) with state*cohort trend            -0.048    -0.032    4797        -0.016
                                                 (0.004)   (0.021)               (0.024)

Panel B (1960-2000)
(6) 1% 1960, 2% 1970, 5% 1980-00                  -0.034    -0.035    8636        -0.026
    age cubic*year                               (0.003)   (0.014)               (0.015)
(7) Sample (6) with state*cohort trend            -0.036    -0.020    8636        -0.012
                                                 (0.003)   (0.015)               (0.016)

Panel C (1960-2000 by year)
(8) Sample (6) 1960 only                          -0.025    -0.085    2397        -0.081
                                                 (0.006)   (0.045)               (0.052)
(9) Sample (6) 1970 only                          -0.061    -0.022    2400        -0.023
                                                 (0.005)   (0.032)               (0.033)
(10) Sample (6) 1980 only                         -0.043    -0.006    2399         0.023
                                                 (0.004)   (0.025)               (0.029)
(11) Sample (6) 1990 only                         -0.012     0.021    1440         0.027
                                                 (0.005)   (0.040)               (0.039)

Panel D (1960-2000 by age)
(12) Sample (6) 35-54 year olds                   -0.017    -0.059    2879        -0.067
                                                 (0.005)   (0.040)               (0.036)
(13) Sample (6) 55-64 year olds                   -0.039    -0.066    2398         0.063
                                                 (0.005)   (0.041)               (0.053)
(14) Sample (6) 65-89 year olds                   -0.030    -0.005    3071        -0.047
                                                 (0.003)   (0.019)               (0.023)

Panel E (1960-2000 by cohort)
(15) Sample (6) cohorts 1901-1912                 -0.019    -0.098    3644        -0.203
                                                 (0.004)   (0.037)               (0.125)
(16) Sample (6) cohorts 1913-1925                 -0.017     0.039    4992         0.025
                                                 (0.004)   (0.022)               (0.023)

Notes: All specifications include year dummies, cohort dummies, state of birth dummies, region of
birth interacted with cohort and an intercept (except for rows 5 and 7). Estimates are weighted using
the number of observations in the cell in the base year. Standard errors, shown in parentheses, are
clustered at the state of birth and cohort level.




Table 5: Estimates of Effects of Education on Health Outcomes in the SIPP

Dependent Variable                          OLS/Probit    IV/2SCML    IV/2SCML  exogeneity test       N
                                                                      % effect          p-value

Panel A: General Health Outcomes
Self Reported Health                           -0.0941     -0.2289      -0.074            0.074   26030
  (1 is excellent, 5 is poor)                 (0.0023)    (0.0745)
Health Index (1 to 100 scale)                   1.9674      4.5345       0.067            0.131   26030
                                              (0.0511)    (1.6738)
Fair or Poor Health                            -0.0359     -0.0824      -0.230            0.176   26030
                                              (0.0010)    (0.0343)
Poor Health                                    -0.0141     -0.0269      -0.226            0.533   26030
                                              (0.0006)    (0.0206)
Hospitalized in Last Year                      -0.0049     -0.0268      -0.149            0.364   26484
                                              (0.0008)    (0.0241)
Days in Bed, last 4 months                     -0.3310      2.1526       0.547            0.074   25223
                                              (0.0364)    (1.4848)
Number of Times Hospitalized                   -0.0101     -0.0944      -0.335            0.329   22229
                                              (0.0024)    (0.0884)
Number of Nights in Hospital                   -0.0730     -1.0828      -0.567            0.185   26289
                                              (0.0186)    (0.7668)

Panel B: Functional Limitations/ADL/IADL
Trouble Seeing                                 -0.0122     -0.0559      -0.412            0.085   20853
                                              (0.0007)    (0.0254)
Trouble Hearing                                -0.0103     -0.0499      -0.329            0.109   20845
                                              (0.0007)    (0.0247)
Trouble Speaking                               -0.0019     -0.0192      -0.909            0.039   20573
                                              (0.0002)    (0.0079)
Trouble Lifting                                -0.0198     -0.0055      -0.023            0.667   20837
                                              (0.0009)    (0.0330)
Trouble Walking                                -0.0251      0.0130       0.045            0.242   20797
                                              (0.0011)    (0.0325)
Trouble with Stairs                            -0.0250     -0.0066      -0.024            0.993   20820
                                              (0.0010)    (0.0324)
Trouble Getting Around Outside the Home        -0.0120     -0.0146      -0.114            0.918   17401
                                              (0.0008)    (0.0257)
Trouble Getting Around Inside the Home         -0.0048      0.0051       0.087            0.635   17463
                                              (0.0005)    (0.0208)
Trouble Getting In/Out of Bed                  -0.0056      0.0013       0.016            0.764   17621
                                              (0.0006)    (0.0230)
Trouble Seeing at all                          -0.0020     -0.0078      -0.343            0.490   20589
                                              (0.0002)    (0.0084)
Trouble Hearing at all                         -0.0008     -0.0100      -0.758            0.060   20256
                                              (0.0001)    (0.0045)
Trouble Speaking at all                         0.0000     -0.0008                        0.000    7516
                                              (0.0001)          **
Trouble Lifting at all                         -0.0100     -0.0029      -0.025            0.775   20789
                                              (0.0007)    (0.0250)
Trouble Walking at all                         -0.0148      0.0107       0.069            0.328   20723
                                              (0.0008)    (0.0260)
Trouble with Stairs at all                     -0.0114      0.0071       0.061            0.359   20775
                                              (0.0006)    (0.0202)
Needs Help                                     -0.0066      0.0044       0.050            0.470   13598
  Getting Around Outside                      (0.0007)    (0.0153)
Needs Help                                     -0.0010      0.0108       0.446            0.125   13757
  Getting Around Inside                       (0.0002)    (0.0078)
Needs Help                                     -0.0011      0.0092       0.372            0.191   13794
  Getting In/Out of Bed                       (0.0003)    (0.0080)

Panel C: Specific Health Conditions
Difficulty                                     -0.0250     -0.0743      -0.175            0.157   19073
                                              (0.0013)    (0.0348)
Arthritis                                      -0.0088     -0.0043      -0.034            0.836   19012
                                              (0.0008)    (0.0217)
Back                                           -0.0028     -0.0349      -0.561            0.061   18924
                                              (0.0005)    (0.0167)
Blind                                          -0.0014      0.0145       0.557            0.060   18454
                                              (0.0003)    (0.0084)
Cancer                                         -0.0007      0.0025       0.161            0.677   18569
                                              (0.0002)    (0.0078)
Deaf                                           -0.0003     -0.0041      -0.179            0.568   18422
                                              (0.0002)    (0.0064)
Deformity                                      -0.0006     -0.0159      -0.591            0.018   18821
                                              (0.0002)    (0.0066)
Diabetes                                       -0.0023     -0.0258      -0.868            0.007   18688
                                              (0.0003)    (0.0082)
Heart                                          -0.0062     -0.0014      -0.016            0.804   19025
                                              (0.0006)    (0.0194)
Hernia                                         -0.0003      0.0023       0.362            0.454   17179
                                              (0.0001)    (0.0037)
Hypertension                                   -0.0031      0.0376       1.053            0.000   18683
                                              (0.0004)    (0.0124)
Kidney                                         -0.0001      0.0042       0.938            0.072   16593
                                              (0.0001)    (0.0027)
Lung                                           -0.0037      0.0203       0.472            0.106   19060
                                              (0.0005)    (0.0152)
Mental Illness                                -0.00009     -0.0002      -0.045            0.932   15794
                                             (0.00008)    (0.0424)
Missing Limb                                  -0.00007     -0.0019      -0.580            0.155   14565
                                             (0.00005)    (0.0016)
Paralysis                                     -0.00011      0.0016       0.287            0.348   17301
                                             (0.00006)    (0.0020)
Senility                                      -0.00005     -0.0015      -0.214            0.070   17993
                                             (0.00002)    (0.0006)
Stomach                                        -0.0006      0.0069       0.695            0.195   17701
                                              (0.0002)    (0.0060)
Stroke                                         -0.0008      0.0084       0.397            0.295   18918
                                              (0.0003)    (0.0090)
Thyroid                                     -0.0000001    0.000001       0.000            0.000   14559
                                            (0.000000)          **
Other                                          -0.0023     -0.0013      -0.019            0.947   19060
                                              (0.0005)    (0.0152)










Table 6: Smallpox Vaccination Rates of Young Children in 1930

State             at age 4  at age 5  change, age 4 to 5  % change, age 4 to 5

Alabama                  9        15                   6                  66.7
Alaska                  NA        NA                  NA                    NA
Arizona                 22        25                   3                  13.6
Arkansas                 5        23                  18                 360.0
California              23        33                  10                  43.5
Colorado                13        53                  40                 307.7
Connecticut             35        65                  30                  85.7
Delaware                 8         4                  -4                 -50.0
DC                      14        35                  21                 150.0
Florida                 NA        NA                  NA                    NA
Georgia               30.5      56.5                  26                  85.2
Hawaii                  33        38                   5                  15.2
Idaho                   11        18                   7                  63.6
Illinois                16        26                  10                  62.5
Indiana                 13        14                   1                   7.7
Iowa                    18        26                   8                  44.4
Kansas                  14        26                  12                  85.7
Kentucky              20.5        50                29.5                 143.9
Louisiana               23        46                  23                 100.0
Maine                   28        50                  22                  78.6
Maryland                34        60                  26                  76.5
Massachusetts           25        62                  37                 148.0
Michigan                17        25                   8                  47.1
Minnesota               10        15                   5                  50.0
Mississippi           21.5      31.5                  10                  46.5
Missouri                21        37                  16                  76.2
Montana                  9         8                  -1                 -11.1
Nebraska                16        15                  -1                  -6.3
Nevada                  28        28                   0                   0.0
NH                      28        76                  48                 171.4
NJ                      25        53                  28                 112.0
NM                      NA        NA                  NA                    NA
NY                      23        63                  40                 173.9
NC                     3.5        10                 6.5                 185.7
ND                      33        37                   4                  12.1
Ohio                    15        34                  19                 126.7
Oklahoma                19        27                   8                  42.1
Oregon                  13        15                   2                  15.4
Pennsylvania             9        29                  20                 222.2
RI                      51        86                  35                  68.6
SC                      11        17                   6                  54.5
SD                      25        40                  15                  60.0
Tennessee               10        23                  13                 130.0
Texas                   13        27                  14                 107.7
Utah                    13        13                   0                   0.0
Vermont                 NA        NA                  NA                    NA
Virginia                10        16                   6                  60.0
Washington            14.5      24.5                  10                  69.0
WV                      NA        NA                  NA                    NA
Wisconsin               18        27                   9                  50.0
Wyoming                 NA        NA                  NA                    NA

Median                17.0      27.0                10.0                  66.7
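The two right-hand columns of Table 6 follow arithmetically from the first two: the change is the age 5 rate minus the age 4 rate, and the percent change divides that difference by the age 4 rate. The short Python sketch below reproduces the reported figures for three rows copied from the table. The function name `vaccination_change` is illustrative and does not come from the paper.

```python
def vaccination_change(rate_age4, rate_age5):
    """Return (change, percent change) between vaccination rates at ages 4 and 5."""
    change = rate_age5 - rate_age4
    # Percent change is measured relative to the age 4 rate, rounded to one decimal.
    pct_change = 100.0 * change / rate_age4
    return change, round(pct_change, 1)

# (rate at age 4, rate at age 5), copied from Table 6
rows = {
    "Alabama": (9, 15),
    "Colorado": (13, 53),
    "Delaware": (8, 4),
}

for state, (age4, age5) in rows.items():
    change, pct = vaccination_change(age4, age5)
    print(f"{state}: change = {change}, % change = {pct}")
```

Because the percent change is taken relative to the age 4 rate, states with very low vaccination at age 4 (Arkansas, Colorado) show very large percent changes even when the absolute change is moderate.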




Table 7: IV Estimates of Mortality and Health by Stringency of Compulsory Vaccination Laws

                          Mortality       Health         Poor      Trouble      Trouble      Trouble         Back Stiffness or     Senility
                                           Index       Health       Seeing      Hearing     Speaking                 deformity

Baseline Sample              -0.028        4.535       -0.035       -0.057       -0.053       -0.022       -0.035       -0.021       -0.013
                            (0.014)      (1.674)      (0.024)      (0.028)      (0.027)      (0.012)      (0.019)      (0.012)      (0.007)
N                              8636        26030        26030        20853        20845        20834        19073        19073        19073

Sample with Vacc data        -0.025        4.854       -0.046       -0.050       -0.052       -0.020       -0.032       -0.020       -0.012
                            (0.015)      (1.590)      (0.023)      (0.027)      (0.026)      (0.011)      (0.019)      (0.012)      (0.007)
N                              7736        24958        24958        20045        20036        20027        18371        18371        18371

Stringent States             -0.023        3.348       -0.059       -0.029       -0.038       -0.002       -0.018       -0.014       -0.001
                            (0.012)      (1.804)      (0.027)      (0.030)      (0.024)      (0.007)      (0.020)      (0.012)      (0.006)
N                              3600        13841        13841        11225        11219        11207        10417        10417        10417

Non-Stringent States          0.020        3.774        0.001       -0.023        0.026       -0.009       -0.030       -0.016       -0.022
                            (0.027)      (1.760)      (0.029)      (0.033)      (0.032)      (0.012)      (0.029)      (0.018)      (0.014)
N                              4136        11117        11117         8820         8817         8820         7954         7954         7954

Notes: All IV regressions include female dummy, cohort dummies, state of birth dummies, and 7 time varying state of birth
characteristics (at age 14) from Lleras-Muney (2005). These are % urban, % foreign born, % black, % mfg, mfg wage, doctors per-capita,
education expenditures per-capita, and schools per sq. mile. The mortality results also use region of birth interacted with cohort while the
SIPP results use state-specific cohort trends. Instruments are categories of required years of schooling in state of residence at age 14.
Standard errors are clustered on state of birth and cohort.
Figure 1: Ten Year Mortality Rates by Age Across Census Years
Figure 2: Vaccination Rates for 4 and 5 year olds by state